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Abstract — This paper presents the application of a particle 
filter for data assimilation in the context of puff-based dis- 
persion models. Particle filters provide estimates of the higher 
moments, and are well suited for strongly nonlinear and/or non- 
Gaussian models. The Gaussian puff model SCIPUFF, is used 
in predicting the chemical concentration field after a chemical 
incident. This model is highly nonlinear and evolves with variable 
state dimension and, after sufficient time, high dimensionality. 
While the particle filter formalism naturally supports variable 
state dimensionality high dimensionality represents a challenge 
in selecting an adequate number of particles, especially for 
the Bootstrap version. We present an implementation of the 
Bootstrap particle filter and compare its performance with the 
SCIPUFF predictions. Both the model and the Particle Filter 
are evaluated on the Dipole Pride 26 experimental data. Since 
there is no available ground truth, the data has been divided in 
two sets: training and testing. We show that even with a modest 
number of particles, the Bootstrap particle filter provides better 
estimates of the concentration field compared with the process 
model, without excessive increase in computational complexity. 
Keywords: Data Assimilation, Particle Filter, Chem-Bio 
Field Test, Dispersion Model. 

I. Introduction 

There is an increase in requirement for accuracy and com- 
putational performance in atmospheric transport and diffusion 
models used in critical decision making in the context of chem- 
ical, biological, radiological, and nuclear (CBRN) incidents. 
The output (field concentrations and dosages) of the dispersion 
models is used directly to guide decision-makers, and as an 
input for higher fusion levels, such as situation and threat 
assessment. Therefore the accuracy of the models as well as 
the time of delivery of the forecasts plays an important part 
in decision making. 

Gaussian dispersion models have been extensively studied 
and used in assessing the impact of CBRN incidents. They 
have gained popularity due to their straightforward theoreti- 
cal approach, and due to their relatively low computational 
complexity [1]. 

For accurate CBRN output, one cannot rely solely on 
mathematical models or on measurements recorded by the 
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sensors on the field, because of their uncertainties. Thus, for 
better estimates and lower uncertainty, a fusion step, called 
Data Assimilation, is necessary to combine the model forecasts 
and the measurements. 

In Data Assimilation, the estimation of the unknown system 
states given the underlying dynamics of the system and a 
set of observations may be framed as a filtering, smoothing 
or prediction problem. Given a fixed discrete time interval, 
{ti,t2, ■ ■ •*jv} > over which observations are available, the 
problem of filtering is to find the best state at time tk given 
all the observations prior to and including tk- The smoothing 
problem is to find the best state at time tk given all the 
observations up to time t^, where tk < ij\r- For tk > £tv 
the prediction problem is to forecast the state of the system at 
time tk using all the observations in the given interval. 

For a linear system, under the assumption of Gaussian 
probability distributions, the problem of estimating the states 
of the system has an exact closed form solution given by 
the Kalman Filter [2]. If the probability distributions are non- 
Gaussian or the system is nonlinear, in general no closed-form 
solution is available and different assumptions and approxima- 
tions have been made for quasi-optimal solutions maintaining 
both accuracy and tractability. The systems considered here 
are in general nonlinear, but the assumption that the process 
and observation uncertainties can be adequately modeled as 
Gaussian is made. 

The nonlinear filtering problem has been extensively studied 
and successfully employed in many applications, with various 
methods provided in the literature. Among the best understood 
and most frequently cited are the Extended Kalman Filter 
(EKF), the Ensemble Kalman Filter (EnKF), and more recently 
the Unscented Kalman Filter (UKF), and the Particle Filter 
(PF). The EKF is historically the first, and still the most 
widely adopted approach to nonlinear estimation problems. 
It is based on the assumption over small time increments, 
nonlinear system dynamics can be accurately modeled by a 
first-order Taylor series expansion [3], 

The PF uses a sampling approach, estimating the posterior 



probability distribution, including its higher order moments, by 
propagating and updating a number of particles, without the 
assumption of Gaussian statistics [4]. The variable and high 
dimensionality of the state vector, which poses a problem to 
the standard nonlinear assimilation techniques, can be dealt 
with using Particle Filters [5], Daum, et al. [6] showed that a 
carefully designed Particle Filter should mitigate the curse of 
dimensionality for certain filtering problems. 

This paper presents an implementation of the Bootstrap 
Particle Filter to assimilate chemical concentration readings 
of the Dipole Pride 26 experiment [7] into the SCIPUFF 
dispersion model. The Particle Filter takes into account the 
uncertainty due to the data errors in the observed meteorology. 
The results show that the particle filter, with a modest number 
of particles, provides better estimates of the concentration field 
compared with the process model, underlying the importance 
of meteorological data accuracy in CBRN incidents. 

Data assimilation based on sampling techniques for CBRN 
incidents and numerical weather prediction (NWP), such as 
Particle Filter [5], Unscented Kalman Filter [8], [9] or En- 
semble Kalman Filter [10], [11], become more and more 
accessible as parallel computing becomes mainstream with 
the introduction of multi-core processors and multiple CPU 
computers. Sampling techniques such as Monte Carlo analysis 
[12], [13] or ensemble method [14], [15] have been used 
before to account for uncertainty due to data errors. 

The Dipole Pride 26 (DP26) field experiment and the 
SCIPUFF dispersion model are presented in Section [II] The 
Bootstrap Particle Filter is introduced in Section [III] and its 
implementation for this particular problem is described in 
Section [IV] Numerical results on the CBRN scenario are 
presented in Section [V] and the conclusions and future work 
are discussed in Section [VT] 

II. Dipole Pride 26 and SCIPUFF 

The Dipole Pride 26 field experiment has been designed to 
validate transport and diffusion models at mesoscale distances 
[7]. The experiment has been conducted at Yucca Flat, Nevada 
where gaseous sulfur hexafluoride (SF6) has been released in 
a series of instrumented trials, of which only 17 provide useful 
puff dimension information. 

Three lines of sensors, Fig[T| each with 30 whole air- 
samplers have been used to record average concentrations 
every 15 min. The chemical sensors, known also as sequential 
bag samplers (12 bags per sensor), are placed at 1.5 m above 
the ground and spacing along lines is about 250 m. The 
total sampling time of the chemical sensor is 3 hr, hence 
total experimental duration for each trial to be monitored is 
3.5 hr. This is achieved by delaying the acquisition of the last 
line of sensors with 30 min. Six continuous SF6 analyzers, 
TGA-4000, have been used to record high-frequency variations 
of the gas concentration field, but their placement does not 
offer enough resolution to capture the crosswind structure of 
the chemical plume, and for the purpose of this paper these 
readings have been excluded from the study. 
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Figure 1. Process Model: chemical dosage plot after 3hr at 1.5m 



Eight Meteorological Data (MEDA) stations were used to 
provide surface-based meteorological measurements and two 
pilot balloon stations and one radiosonde provided the upper- 
air meteorological profiles. The meteorological measurements 
recorded provide information about the wind direction and 
speed, temperature, pressure and humidity. The variation of the 
wind field is given by the standard deviation of hourly wind 
speeds and directions recorded at the MEDA stations which 
are about 0.5 - 2 m s _1 and 10° - 30° respectively [16]. 

In this paper we are focusing only of trial number six, 
which took place in Nov 12, 1996 and where a mass of 
11.6 kg of SF6 had been release from 6 m height at the North 
dissemination site N2 as in FigQ] The dispersion model used 
to predict the chemical concentrations and dose at the sensor 
locations is SCIPUFF [17]. SCIPUFF is a Lagrangian puff 
dispersion model that outputs the chemical concentrations at 
specified locations as cumulative contributions of Gaussian 
puffs. SCIPUFF is the dispersion engine incorporated into 
Hazard Prediction and Assessment Capability (HPAC) tool, 
used by the Defense Threat Reduction Agency (DTRA) for 
situation assessment of CBRN incidents. Besides the mean 
concentration field, SCIPUFF is also providing the uncertainty 
in the concentration value due to the stochastic nature of the 
turbulent diffusion process. The simulation is driven by the 
meteorological data, which in this case is provided by the 
MEDA stations, from which a wind field is created. 

The observed wind data are fit in a least square sense, 
using variational methodology. An initial gridded wind field 
is constructed from the observation data by interpolation. 
Adjustments are then made to the initial 3D interpolated wind 
field vectors so as to satisfy conservation of mass in a way that 
also minimizes an integral function of the difference between 



the initial and adjusted fields. 

Existing literature provides evaluation studies of SCIPUFF 
with Dipole Pride 26 [16], [18], [19]. The results reported 
show that SCIPUFF predictions are within a factor of 2 of ob- 
servations about 50%, where the model evaluation was based 
on maximum dosage anywhere on the sampling lines [16]. 
The studies emphasize the importance of wind field in the 
chemical concentration prediction accuracy and conclude that 
SCIPUFF prediction performance is comparable or better than 
other dispersion models. 

III. BOOSTRAP PARTCILE FILTER 

In this section we focus our attention on sequential state es- 
timation using sequential Monte Carlo (SMC). SMC is known 
also as bootstrap filtering, particle filtering, the condensation 
algorithm, interacting particle approximations and survival of 
the fittest [20]. 

Consider the following nonlinear system, described by the 
difference equation and the observation model: 



x fc+1 = f(x fe )+w fc 
z fc = h(x fc )+v fc 



(1) 

(2) 



Denote by Z k = {z^l < i < k} the set of all observations 
up to time k, conditionally independent given the process with 
distribution p(z k \x k ). Also, assume that the state sequence 
Xfe is an unobserved (hidden) Markov process with initial 
distribution p(xq) and transition distribution p(x k+ i\x k ). 

Our aim is to estimate recursively the posterior distribution 
p(x k \Z k ) and expectations of the form [21]: 



E[z k ] = E[h(x k )} = / h(x fc )p(x fc |Zfc)dx fc 



(3) 



In particle filters, the posterior distribution p(x k \Z k ) is 
approximated with N weighted particles {xj: ■ i w t ;}iLi> 
given by 



p(x k \Z k ) 



N 

£ 



Wl'6 m(Xfc) 



(4) 



(i) 

where x. k ' are the particles drawn from the importance 
function or proposal distribution, w k are the normalized 
importance weights that sum up to one and 8 (t)(x k ) denotes 

the Dirac-delta mass located in x!l . Thus the expectation 
of a known function h(x k ) with respect to p(x k \Z k ) is then 
approximated by 

f N 

/ h(x fc )p(x fc |Z fc )dx fc « ^2 w k )h ( x k l) ) ( 5 ) 

i—l 

Suppose that we cannot sample from the posterior distribu- 
tion and use an importance sampling approach to sample from 
a proposal distribution q(x k \Z k ). Hence, we can recursively 
update the weights: 



p(z k+1 \x l k+1 )p(xl 



• J k+l 



q{x 



fc+il A fc' 



Z fc+ i) 



(6) 



After a few iterations all but one particle will have negligible 
weights. Hence the algorithms allots time to update a large 
number of weights with no effect in our sampling, effect 
known as degeneracy problem. This problem can be overcome 
by adding a resampling strategy to Sequential Importance 
Sampling (SIS) Algorithm [Q 

Algorithm 1 Sequential Importance Sampling (SIS) algorithm 

l: Draw 4 ~ ^(xfelxj^Zfe) for i = 1...JV 

2: Compute the importance weights 

,„i _ .,a p(sfc+iK+i)pK+il*fc) 
w k+ i - w k g(4+i | 4iZfc+l) 

3: Normalize the importance weights w k - 



£f =1 K 
4: Multiply/Discard particles {x k Li}iLi with respect to 

(i) 

high/low importance weights w^+i to obtain N new 
particles {x k K}fL l with equal weights. 

The importance function plays a significant role in the 
particle filter. Usually, it is difficult to find a good proposal 
distribution especially in a high dimensional space. One may 
choose to approximate it using different methods, thus differ- 
ent flavor of particle filter. One of the simplest importance 
function is given by 



q(xl +1 \x l k ,Z 



k+lj 



pK+iK) 



(7) 



This implementation is called the Bootstrap Particle filter. 
By substituting (O back into (O the new weight update 
equation becomes: 



•>'k+i ex w l k p{z k+1 \x\ +1 ) 



(8) 



The resampling step reduces the sample impoverishment 
effect but introduces new practical problems: it limits the op- 
portunity to have an efficient parallel algorithm, and particles 
with high weights are statistically selected many times and 
leads to a loss of diversity among the particles [20]. 

IV. Particle Filter Implementation 

Three types of uncertainties [22] are present in the model 
predictions: model uncertainty due to the inaccurate rep- 
resentation of the chemical and dynamical processes, data 
uncertainty due to the errors in data used to drive the model 
and random turbulence of the atmosphere. 

The model uncertainty is not completely reducible, and in 
the SCIPUFF prediction case it is estimated. The uncertainty 
due to the variability of the atmosphere cannot be further 
reduced and the uncertainty due to data errors it is usually 
high, more than 50% of the total uncertainty [23], but it can 
be minimized. The data errors considered in this paper are 
meteorological data: wind speed and wind direction. The errors 
are due to the unrepresentative sitting of the wind sensors 
in the field and sensor accuracy and calibration [19]. This 
information is rarely available and for this study a standard 
deviation of 0.5 m s _1 for the wind speed and 5° for the 
wind direction has been considered. 



A FORTRAN implementation of the Particle Filter has 
been specially created for SCIPUFF to run in parallel on 
the cluster hosted at the Center for Computational Research 
at University at Buffalo. This implementation of the particle 
filter is designed to account for the uncertainty in the wind 
sensor readings while coping with the challenges of the data 
assimilation for the CBRN incidents using Gaussian puff 
models: variable dimensionality and high dimensionality [5]. 

Compared with the maximum dosage approach [16], the 
evaluation method used in this paper is to compare the 
predicted dosage after every 15 min with the corresponding 
observed dosage. 

Since there is no ground truth, the samples provided in the 
DP26 have been divided to two sets: the training set used to 
perform the data assimilation, composed of Line 1 and Line 
2 of sensors, and the testing set, Line 3 of sensors, used for 
performance evaluation. 

Due to the spatial and temporal distance between the wind 
readings all the considered independent random variables 
described by a Gaussian distribution with mean given by 
the nominal sensor reading and uncertainty given by the 
standard deviation mentioned above. Hence the particles, each 
representing a SCIPUFF instance, are propagated using wind 
field generated from data sampled from this distribution. Each 
particle outputs a dose field cC which depends on the wind 
field; here j is the sensor number, i is the particle number and 
k is the time step. The estimated dosage dj is given by the 
following relation: 



N 



d t (k) = J2<d)(k) 



(9) 



Here w\ is given by Eq.®. The conditional probability present 
in Eq.® is unknown since the authors could not find infor- 
mation regarding the uncertainty in the concentration readings 
for the whole air-samplers. The concentrations readings of 
the sensors have been assumed to be independent random 
variables due to the spatial and temporal distances. The 
likelihood have been approximated with a Gaussian function 
given by: 

p{d {k)\d){k)) = M{d){k)\dj{k),v{k)) (10) 

Here dj(k) is the observed dosage at the j th sensor at time 
k and v(k) is the sample variance of the difference between 
observed dosages and predicted ones over all the particles. 
Any predicted dosage less than 1 ppt-hr (including zero) has 
been set to 1 ppt-hr and all the observed dosages less than 10 
ppt-hr have been ignored. Hence the total number of dosage 
values to be compared is 47. 

V. Numerical Results 

Both the process model predictions, using nominal wind 
sensor readings, and the particle filter predictions with per- 
turbed wind field have been compared against the observed 
dosages on the third line of sensors. 

To evaluate the performance of the particle filter compared 
with the process model, we consider the following statistical 



measures [24]: FB - fractional bias, MG - geometric mean 
bias, NMSE - normalized mean square error, VG - geometric 
variance, FAC2 - fraction of predictions within a factor 2 of 
observations and FAC3 - fraction of predictions within a factor 
3 of observations. 
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Here D represents the observed dosage and D p the pre- 
dicted dosage. Since we are dealing with random samples, 
50 Monte Carlo runs have been performed in assessing the 
performance of the particle filter. The numerical results based 
on the performance metrics have been tabulated in Table U 

Overall the particle filter provides improved estimates com- 
pared to the process model and all the performance measures 
are better on average. Since the predicted and observed values 
vary by several orders of magnitude, MG, VG, FAC2 and 
FAC3 are more appropriate. While we do not see a significant 

Table I 
Numerical Results - 50 Monte Carlo runs 





Process Model 


Particle Filter 


PF 95% CI 


FB 


1.426 


1.405 


1.390- 1.419 


MG 


0.456 


0.443 


0.402 - 0.484 


NMSE 


8.658 


8.466 


8.249 - 8.683 


VG 


86.75 


61.20 


53.65 - 68.74 


FAC2 


6.38% 


11.45% 


9.31% - 13.58% 


FAC3 


6.38% 


22.17% 


19.45% - 24.89% 



reduction in the geometric bias, the geometric variance and the 
fraction of factor 2 and 3 give a significant improvement of the 
particle filter over the process model. This is consistent with 
the scatter plots shown in Fig|2]and FigfJ] The particle filter 
is able to alleviate the under-prediction and over-prediction 
problem present in the dispersion models. 

The result reiterates the need of accurate meteorological ob- 
servations and provides support for the use of data assimilation 
in the CBRN incidents. 

VI. Conclusion 

The paper presents an implementation of the Bootstrap Par- 
ticle Filter to correct the SCIPUFF concentration predictions 
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Figure 2. Process Model with nominal wind field 
Scatterplot for Particle Filter 
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Figure 3. Particle Filter with perturbed wind field (best run) 



using concentration measurements provided by the chemical 
sensors deployed in the field. Due to the high uncertainty in 
the meteorological input, the CBRN dispersion models should 
be accompanied by a data assimilation step to account for this 
uncertainty and improve the predictions. For the Dipole Pride 
26 the particle filter has doubled the number of predictions 
within a factor of 2 of the observations and it has tripled the 
ones within a factor of 3 of the observations. 

A complete evaluation of the particle filter on all the Dipole 
Pride 26 trials and simulations with uncertainty for also the 
temperature, pressure and relative humidity, are planned as 
future work. 
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